2 Dataset Description
2.1 Sample Composition
This analysis uses N = 25 spider web images (five groups of five) collected from spiders exposed to different agricultural chemicals:

| Group | N | Drug Type | Mechanism of Action |
|---|---|---|---|
| CONTROL | 5 | None | Baseline web structure |
| CIPERMETRINA | 5 | Insecticide (Pyrethroid) | Synthetic pyrethroid; disrupts sodium channels, causing tremors and impaired motor control |
| ENDOSULFAN | 5 | Insecticide (Organochlorine) | GABA antagonist; causes seizures and neurological disruption (banned in many countries) |
| GLIFOSATO | 5 | Herbicide (Glyphosate) | Glycine analog; disputed neurotoxicity in arthropods |
| SPINOSAD | 5 | Insecticide (Spinosyn) | Nicotinic acetylcholine receptor activator; causes involuntary muscle activity and tremors |
With only five webs per group, statistical power is limited; this constrains biological interpretation and the generalizability of findings.
2.2 Biological Context: Why Study Spider Webs?
Spider web construction is a sensitive bioassay for neurotoxicity. Building a web requires:
Precise motor control: Accurate silk placement and anchor point selection
Spatial memory: Following geometric patterns and maintaining symmetry
Proprioception: Body position awareness during construction
Drugs affecting the nervous system disrupt these processes, manifesting as structural changes visible in the web geometry.
2.2.1 Historical Precedent
Spider web pharmacology dates to 1948 (Witt et al.): drugs like caffeine, LSD, and marijuana produce characteristic web deformations. Modern applications include:
✗ NOT definitive biological conclusions
✗ NOT generalizable without replication
✗ NOT powered for detecting subtle effects
3.2.4 Recommendations for Future Work
Minimal viable study: N ≥ 20 per group
Well-powered study: N ≥ 30 per group
Independent validation cohort: 70/30 train/test split or multi-site data
Results presented here represent upper bounds on performance (likely optimistic due to overfitting) and should guide future adequately-powered studies.
3.3 Point Cloud Sampling
We subsample 1000 points from each web using the farthest point sampling algorithm.
In [4]:
samples = map(Xs) do X
    farthest_points_sample(X, 1000)
end;
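The greedy strategy behind farthest point sampling is simple: start anywhere, then repeatedly add the point farthest from everything chosen so far, which yields even spatial coverage of the web. The notebook's `farthest_points_sample` is a Julia implementation; the following standalone Python sketch is illustrative only:

```python
import numpy as np

def farthest_point_sample(points, n_samples, seed=0):
    """Greedy farthest-point sampling: repeatedly add the point that is
    farthest from the set already chosen, giving even spatial coverage."""
    rng = np.random.default_rng(seed)
    chosen = [int(rng.integers(len(points)))]      # arbitrary start point
    # distance from every point to its nearest chosen point
    dists = np.linalg.norm(points - points[chosen[0]], axis=1)
    while len(chosen) < n_samples:
        idx = int(np.argmax(dists))                # farthest remaining point
        chosen.append(idx)
        dists = np.minimum(dists, np.linalg.norm(points - points[idx], axis=1))
    return points[chosen]

# toy example: subsample 10 of 500 points on a circle
theta = np.linspace(0, 2 * np.pi, 500, endpoint=False)
cloud = np.column_stack([np.cos(theta), np.sin(theta)])
sub = farthest_point_sample(cloud, 10)
print(sub.shape)  # (10, 2)
```

Each iteration costs one pass over the cloud, so the whole procedure is O(n·k) for k samples, which is why it scales to the full-resolution web images here.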
4 Persistence Diagrams
4.1 Compute Rips Filtration
In [5]:
bcs = map(samples) do s
    rips_pd(s, cutoff = 5)
end;
4.2 Extract H0 and H1
In [6]:
# H0: connected components (web fragmentation)
# H1: loops/cycles (cells in web)
# Ripserer returns tuple of (H0, H1, ...) diagrams
# Safely extract diagrams - handle cases where diagrams might be empty or missing
bcs_0 = [length(bc) >= 1 ? bc[1] : [] for bc in bcs]
bcs_1 = [length(bc) >= 2 ? bc[2] : (length(bc) >= 1 ? bc[end] : []) for bc in bcs]

println("Number of samples: ", length(bcs))
println("H0 diagram example (first non-empty): ")
for (i, bc) in enumerate(bcs_0)
    if !isempty(bc)
        println("  Sample $i: ", length(bc), " features")
        break
    end
end
println("H1 diagram example (first non-empty): ")
for (i, bc) in enumerate(bcs_1)
    if !isempty(bc)
        println("  Sample $i: ", length(bc), " features")
        break
    end
end;
Number of samples: 25
H0 diagram example (first non-empty):
Sample 1: 1000 features
H1 diagram example (first non-empty):
Sample 1: 71 features
4.3 Representative Web Examples
Before diving into feature extraction and statistics, let’s visualize representative spider webs from each treatment group along with their persistence diagrams. Each panel shows:
Top left: Persistence diagram (birth-death plot showing H1 cycles)
Top right: Web image intensity heatmap
Bottom: Point cloud sample used for TDA computation
In [7]:
# Show one representative web per group (first sample from each)
for sp in unique_species
    idx = findfirst(==(sp), species)
    if !isnothing(idx) && !isempty(bcs_1[idx])
        p = plot_wing_with_pd(bcs_1[idx], As[idx], samples[idx], sp)
        display(p)
    end
end
5 Feature Extraction
Understanding TDA Features
We extract several statistical summaries from each persistence diagram. Here’s what each feature measures and how it relates to web structure:
| Feature | What it measures | Web interpretation |
|---|---|---|
| n_features | Number of H1 cycles detected | Number of closed cells in the web |
| total_persistence | Sum of all cycle lifespans | Overall topological complexity |
| median_persistence | Median cycle lifespan | Typical cell "robustness" (robust to outliers) |
| max_persistence | Largest cycle lifespan | Most prominent hole or cell |
| entropy | Uniformity of cycle lifespans | High = regular cells; Low = irregular cells |
| median_birth | Median scale at which cycles appear | Typical cell size (robust to outliers) |
These features transform complex persistence diagrams into interpretable numbers that can be compared statistically across treatment groups.
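The `entropy` feature is computed by `rich_stats` in Julia; assuming it is the standard persistence entropy (Shannon entropy of cycle lifespans normalized to sum to one, natural log), it can be sketched in a few lines of Python:

```python
import math

def persistence_entropy(diagram):
    """Shannon entropy of normalized lifespans; diagram is a list of
    finite (birth, death) pairs."""
    lifespans = [d - b for b, d in diagram]
    total = sum(lifespans)
    probs = [l / total for l in lifespans]
    return -sum(p * math.log(p) for p in probs)

# uniform lifespans maximize entropy at log(n): "regular" cells
uniform = [(0.0, 1.0)] * 8
print(round(persistence_entropy(uniform), 4))  # 2.0794, i.e. log(8)

# one dominant cycle lowers entropy: "irregular" structure
skewed = [(0.0, 10.0)] + [(0.0, 0.1)] * 7
print(persistence_entropy(skewed) < persistence_entropy(uniform))  # True
```

This matches the interpretation in the table: webs whose cells all live about equally long score near the maximum log(n), while a few dominant cells drive the value down.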
5.1 Rich Statistics - H1 (Cycles)
In [8]:
stats_h1 = [rich_stats(bc) for bc in bcs_1];
DataFrame(
    Specie = species,
    n_features = [s.n_features for s in stats_h1],
    total_persistence = round.([s.total_persistence for s in stats_h1], digits=3),
    median_persistence = round.([s.median_persistence for s in stats_h1], digits=3),
    max_persistence = round.([s.max_persistence for s in stats_h1], digits=3),
    entropy = round.([s.entropy for s in stats_h1], digits=3),
)
25×6 DataFrame

| Row | Specie | n_features | total_persistence | median_persistence | max_persistence | entropy |
|----:|--------|-----------:|------------------:|-------------------:|----------------:|--------:|
| 1 | CIPERMETRINA | 71 | 831.438 | 9.631 | 66.449 | 4.079 |
| 2 | CIPERMETRINA | 69 | 779.229 | 8.354 | 37.007 | 4.076 |
| 3 | CIPERMETRINA | 66 | 925.464 | 11.833 | 58.847 | 4.033 |
| 4 | CIPERMETRINA | 69 | 777.478 | 10.289 | 28.721 | 4.138 |
| 5 | CIPERMETRINA | 50 | 499.91 | 8.61 | 22.183 | 3.811 |
| 6 | CONTROL | 68 | 473.123 | 6.0 | 19.114 | 4.159 |
| 7 | CONTROL | 81 | 577.74 | 6.403 | 14.111 | 4.357 |
| 8 | CONTROL | 138 | 1086.26 | 6.606 | 38.022 | 4.836 |
| 9 | CONTROL | 108 | 764.702 | 6.43 | 16.632 | 4.643 |
| 10 | CONTROL | 93 | 626.066 | 6.275 | 12.905 | 4.507 |
| 11 | ENDOSULFAN | 87 | 648.155 | 6.298 | 31.075 | 4.379 |
| 12 | ENDOSULFAN | 68 | 543.676 | 6.945 | 22.366 | 4.145 |
| 13 | ENDOSULFAN | 80 | 812.353 | 6.842 | 80.652 | 4.111 |
| 14 | ENDOSULFAN | 59 | 610.397 | 6.896 | 58.875 | 3.785 |
| 15 | ENDOSULFAN | 61 | 616.653 | 8.022 | 55.703 | 3.941 |
| 16 | GLIFOSATO | 60 | 965.556 | 10.522 | 107.758 | 3.752 |
| 17 | GLIFOSATO | 43 | 695.187 | 10.05 | 60.592 | 3.474 |
| 18 | GLIFOSATO | 72 | 868.896 | 8.868 | 79.585 | 4.049 |
| 19 | GLIFOSATO | 54 | 695.542 | 9.893 | 68.599 | 3.765 |
| 20 | GLIFOSATO | 67 | 866.073 | 9.56 | 51.756 | 4.025 |
| 21 | SPINOSAD | 70 | 793.129 | 9.768 | 37.691 | 4.093 |
| 22 | SPINOSAD | 69 | 827.155 | 10.814 | 53.565 | 4.099 |
| 23 | SPINOSAD | 72 | 780.85 | 8.964 | 38.603 | 4.137 |
| 24 | SPINOSAD | 64 | 747.34 | 9.938 | 27.148 | 4.04 |
| 25 | SPINOSAD | 78 | 831.255 | 9.219 | 57.754 | 4.216 |
5.2 H0 (Connected Components) - Not Analyzed
Why H0 is Not Analyzed
All spider webs in this dataset remain structurally connected (single component), meaning H0 persistence provides minimal discriminatory information between treatment groups.
The absence of web fragmentation suggests:
Spiders complete web construction despite drug exposure
Drug effects manifest primarily as topological changes within connected structures (H1 features)
Changes in loop/cell patterns (H1) rather than complete structural breakdown
Therefore, we focus our analysis on H1 (one-dimensional persistence) which captures the relevant differences in cell structure and regularity.
In [9]:
# Compute H0 stats for completeness (not shown in main analysis)
stats_h0 = [rich_stats(bc) for bc in bcs_0];
5.3 Feature Matrices
In [10]:
# H1 rich stats matrix
X_h1, feature_names_h1 = stats_to_matrix(stats_h1)
# H0 rich stats matrix
X_h0, feature_names_h0 = stats_to_matrix(stats_h0)

# Safe z-score normalization (handles constant columns)
function safe_zscore(x)
    s = std(x)
    s == 0 ? zeros(length(x)) : zscore(x)
end

X_h1_normalized = mapslices(safe_zscore, X_h1, dims=1)
X_h0_normalized = mapslices(safe_zscore, X_h0, dims=1)

# Replace any NaN with 0
X_h1_normalized[isnan.(X_h1_normalized)] .= 0.0
X_h0_normalized[isnan.(X_h0_normalized)] .= 0.0

println("H1 features: ", feature_names_h1)
println("Feature matrix size: $(size(X_h1))")
# Full vectorization (persistence images, Betti curves, landscapes, etc.)
feature_vectors = [vectorize_diagram(bc) for bc in bcs_1]
X_features = reduce(hcat, feature_vectors)'
println("Vectorized features dimension: $(size(X_features, 2))")
Vectorized features dimension: 362
6 Exploratory Visualization
6.1 Summary Statistics by Drug
In [12]:
println("Mean Statistics by Group:\n")
for sp in unique_species
    idx = findall(==(sp), species)
    if isempty(idx)
        println("$sp: no samples\n")
        continue
    end
    group_stats = stats_h1[idx]
    entropies = [s.entropy for s in group_stats]
    n_feats = [s.n_features for s in group_stats]
    max_pers = [s.max_persistence for s in group_stats]
    mean_entropy = isempty(entropies) ? 0.0 : mean(entropies)
    mean_n_features = isempty(n_feats) ? 0.0 : mean(n_feats)
    mean_max_pers = isempty(max_pers) ? 0.0 : mean(max_pers)
    println("$sp:")
    println("  - Mean cycles (H1): $(round(mean_n_features, digits=1))")
    println("  - Mean entropy: $(round(mean_entropy, digits=3))")
    println("  - Mean max persistence: $(round(mean_max_pers, digits=3))")
    println()
end
Mean Statistics by Group:
CIPERMETRINA:
- Mean cycles (H1): 65.0
- Mean entropy: 4.027
- Mean max persistence: 42.642
CONTROL:
- Mean cycles (H1): 97.6
- Mean entropy: 4.5
- Mean max persistence: 20.157
ENDOSULFAN:
- Mean cycles (H1): 71.0
- Mean entropy: 4.072
- Mean max persistence: 49.734
GLIFOSATO:
- Mean cycles (H1): 59.2
- Mean entropy: 3.813
- Mean max persistence: 73.658
SPINOSAD:
- Mean cycles (H1): 70.6
- Mean entropy: 4.117
- Mean max persistence: 42.952
Some drug groups show more heterogeneity in web structure than others. Below we show the most and least complex webs (by entropy) within each group to illustrate this variability. High entropy indicates regular, uniform cell sizes; low entropy indicates irregular cells.
In [15]:
for sp in unique_species
    idx = findall(==(sp), species)
    if length(idx) >= 2
        group_entropies = [stats_h1[i].entropy for i in idx]
        sorted_idx = idx[sortperm(group_entropies)]
        low_i = sorted_idx[1]
        high_i = sorted_idx[end]
        p_low = plot_wing_with_pd(bcs_1[low_i], As[low_i], samples[low_i], "$sp - Low entropy (irregular)")
        p_high = plot_wing_with_pd(bcs_1[high_i], As[high_i], samples[high_i], "$sp - High entropy (regular)")
        combined = pt.plot(p_low, p_high, layout=(1, 2), size=(1200, 400))
        display(combined)
    end
end
6.5 Feature Distributions by Group
The boxplots below show how key TDA features are distributed across treatment groups. This helps visualize group differences before formal statistical testing.
In [16]:
p1 = plot_feature_boxplot([s.entropy for s in stats_h1], species, "Entropy (H1)")
p2 = plot_feature_boxplot([s.n_features for s in stats_h1], species, "Number of Cycles (H1)")
p3 = plot_feature_boxplot([s.max_persistence for s in stats_h1], species, "Max Persistence (H1)")
p4 = plot_feature_boxplot([s.total_persistence for s in stats_h1], species, "Total Persistence (H1)")
pt.plot(p1, p2, p3, p4, layout=(2, 2), size=(900, 700))
Figure 1: Distribution of key TDA features across treatment groups
7 Distance Analysis
Understanding Wasserstein Distance
The Wasserstein distance (also called Earth Mover’s Distance) measures how different two persistence diagrams are.
Intuition: Imagine each point in a persistence diagram as a pile of dirt. The Wasserstein distance is the minimum “work” needed to transform one diagram into another by moving dirt around.
Why use it for TDA?
Specifically designed for comparing persistence diagrams
Captures both the locations of topological features and how they should be matched
Has metric properties, enabling use with standard machine learning methods (like KNN)
Notation: Wasserstein(p, q) — we use p=1 (sum of movements) and q=2 (Euclidean ground metric).
A small Wasserstein distance means two webs have similar topological structure; a large distance means their persistence diagrams differ substantially.
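To make the matching concrete, here is a brute-force Python sketch of W₁ between two tiny diagrams. Each point may be matched either to a point of the other diagram or to its projection onto the diagonal (the cost of "deleting" a near-diagonal feature); production code uses optimal transport solvers rather than enumeration, so this is purely illustrative:

```python
from itertools import combinations, permutations

def diag_cost(p):
    """Cost of matching a diagram point to the diagonal (distance to its
    nearest diagonal point under the Euclidean ground metric)."""
    return abs(p[1] - p[0]) / 2 ** 0.5

def point_cost(p, q):
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

def wasserstein1_bruteforce(A, B):
    """W1 between two small diagrams: try every partial matching;
    unmatched points pay their distance to the diagonal."""
    best = float("inf")
    for k in range(min(len(A), len(B)) + 1):
        for subA in combinations(range(len(A)), k):
            for subB in permutations(range(len(B)), k):
                cost = sum(point_cost(A[i], B[j]) for i, j in zip(subA, subB))
                cost += sum(diag_cost(A[i]) for i in range(len(A)) if i not in subA)
                cost += sum(diag_cost(B[j]) for j in range(len(B)) if j not in subB)
                best = min(best, cost)
    return best

# two similar loops: cheap to match point-to-point
print(wasserstein1_bruteforce([(0.0, 2.0)], [(0.0, 2.5)]))   # 0.5

# a short-lived (noisy) loop is cheap to absorb into the diagonal
print(round(wasserstein1_bruteforce([(0.0, 0.1)], []), 4))   # 0.0707
```

The second call shows why the metric is stable: topological noise sits near the diagonal, so it contributes almost nothing to the distance between webs.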
Understanding MDS (Multidimensional Scaling)
MDS converts a distance matrix into low-dimensional coordinates for visualization:
Start with pairwise distances between all samples
Find 2D or 3D coordinates that preserve these distances as well as possible
Plot the coordinates — samples close together have similar features
How to interpret MDS plots:
Clusters = groups of samples with similar topological features
Separation between clusters = distinct TDA signatures between groups
Overlap between groups = ambiguity; these groups are hard to distinguish topologically
MDS is purely for visualization — it doesn’t make statistical claims, but helps us see patterns before formal testing.
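The three steps above correspond to classical (Torgerson) MDS, which is only a few lines of linear algebra. The notebook's `mds_embedding` presumably does something similar; this Python sketch is illustrative:

```python
import numpy as np

def classical_mds(D, dim=2):
    """Classical (Torgerson) MDS: double-center the squared distance
    matrix to get a Gram matrix, then use its top eigenvectors."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # Gram matrix of centered coords
    vals, vecs = np.linalg.eigh(B)           # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:dim]
    scale = np.sqrt(np.clip(vals[order], 0, None))
    return vecs[:, order] * scale            # (n, dim) coordinates

# three points on a line at 0, 3, 7: one dimension recovers all gaps exactly
D = [[0.0, 3.0, 7.0],
     [3.0, 0.0, 4.0],
     [7.0, 4.0, 0.0]]
Y = classical_mds(D, dim=1)
print(round(abs(Y[1, 0] - Y[0, 0]), 6))  # 3.0
print(round(abs(Y[2, 0] - Y[0, 0]), 6))  # 7.0
```

When the input distances are not exactly Euclidean (Wasserstein distances generally are not), the eigenvalue clipping discards negative components and the embedding preserves the distances only approximately, which is why MDS plots here are read qualitatively.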
7.2 Distance Metric Comparison: Wasserstein vs Bottleneck
Different distance metrics capture different aspects of topological dissimilarity. We compare two fundamental persistence diagram distances:
Wasserstein vs Bottleneck: Theoretical Differences
| Property | Wasserstein W₁ | Bottleneck d∞ |
|---|---|---|
| Definition | Optimal matching cost (total transport) | Worst-case matching cost (max single distance) |
| Formula | Sum of all point distances in optimal matching | Maximum single point distance in optimal matching |
| Sensitivity | Sensitive to all points (global measure) | Dominated by outliers (local measure) |
| Stability | More stable in presence of noise | Can be unstable with outliers |
| Interpretation | "Average structural difference" | "Maximum local difference" |
| Computation | O(n³) via Hungarian algorithm | O(n^2.5) via min-cost flow |
When to use which?
Wasserstein: When all topological features matter; captures overall structural difference
Bottleneck: When the largest discrepancy matters; robust to small noise but sensitive to big changes
println("\n=== Distance Metric Analysis ===\n")
if dist_correlation > 0.8
    println("✓ High correlation (r = $(round(dist_correlation, digits=2)))")
    println("  Metrics largely agree on sample relationships")
    println("  Either metric is appropriate for this dataset")
elseif dist_correlation > 0.5
    println("≈ Moderate correlation (r = $(round(dist_correlation, digits=2)))")
    println("  Metrics capture somewhat different aspects")
    println("  Consider using both for complementary information")
else
    println("✗ Low correlation (r = $(round(dist_correlation, digits=2)))")
    println("  Metrics capture fundamentally different structures")
    println("  Choice significantly impacts conclusions")
end

println("\nRecommendation:")
if acc_wass >= acc_bott
    println("→ Use Wasserstein distance for this dataset")
    println("  Better classification performance")
    println("  Captures overall structural differences relevant to drug effects")
else
    println("→ Use Bottleneck distance for this dataset")
    println("  Better classification performance")
    println("  Largest topological differences most informative")
end
=== Distance Metric Analysis ===
✗ Low correlation (r = 0.21)
Metrics capture fundamentally different structures
Choice significantly impacts conclusions
Recommendation:
→ Use Wasserstein distance for this dataset
Better classification performance
Captures overall structural differences relevant to drug effects
# 3D MDS - Wasserstein (disabled: PlotlyJS output not compatible with Quarto)
mds_wass_3d = mds_embedding(D_wasserstein; maxoutdim=3)
plot_3d_scatter(mds_wass_3d.embedding, species; title="MDS 3D (Wasserstein H1)")
8 Statistical Tests
Our statistical analysis follows a three-stage approach:
Omnibus test (Kruskal-Wallis): Do ANY groups differ from each other?
Pairwise comparisons (Permutation tests): WHICH drugs differ from control?
Effect sizes (Cohen’s d): HOW MUCH do they differ?
This hierarchical approach controls false positives while providing interpretable effect magnitudes.
What is the Kruskal-Wallis Test?
The Kruskal-Wallis test is a non-parametric alternative to one-way ANOVA. We use it here because:
No normality assumption: Unlike ANOVA, it doesn’t require the data to follow a normal distribution — important for TDA features which may have unusual distributions
Robust to outliers: Uses ranks instead of raw values, so extreme points don’t dominate
Works with small samples: Reliable even with limited data per group
How to interpret the p-value:
p < 0.05: Strong evidence that at least one group differs from the others (marked with *)
p ≥ 0.05: Insufficient evidence to conclude groups differ
Why not use ANOVA? With small sample sizes and potentially non-normal distributions (common in TDA features), Kruskal-Wallis is more reliable and makes fewer assumptions.
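To see what the test actually computes, here is a minimal Python sketch of the H statistic, assuming no tied values (the real test adds a tie correction and compares H to a χ² distribution with k−1 degrees of freedom to get the p-value):

```python
def kruskal_wallis_H(*groups):
    """Kruskal-Wallis H (no tie correction): rank all values jointly,
    then measure how far each group's mean rank sits from the grand mean."""
    pooled = sorted((v, gi) for gi, g in enumerate(groups) for v in g)
    N = len(pooled)
    ranks = {}
    for rank, (_, gi) in enumerate(pooled, start=1):
        ranks.setdefault(gi, []).append(rank)
    grand = (N + 1) / 2
    return 12.0 / (N * (N + 1)) * sum(
        len(r) * (sum(r) / len(r) - grand) ** 2 for r in ranks.values()
    )

# perfectly separated groups score higher than interleaved ones
print(kruskal_wallis_H([1, 2], [3, 4]))  # 2.4
print(kruskal_wallis_H([1, 3], [2, 4]))  # 0.6
```

Because only ranks enter the statistic, an extreme outlier in one web's entropy moves its rank by at most a few positions, which is the robustness property noted above.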
8.1 Kruskal-Wallis Tests
In [29]:
println("Kruskal-Wallis Tests for Group Differences:\n")

# Test key features
features_to_test = [
    ("entropy", [s.entropy for s in stats_h1]),
    ("n_features", [s.n_features for s in stats_h1]),
    ("max_persistence", [s.max_persistence for s in stats_h1]),
    ("total_persistence", [s.total_persistence for s in stats_h1]),
]

for (name, values) in features_to_test
    kw = test_group_differences(values, species)
    sig = kw.p_value < 0.05 ? " *" : ""
    println("$name: p = $(round(kw.p_value, digits=4))$sig")
end
Kruskal-Wallis Tests for Group Differences:
entropy: p = 0.004 *
n_features: p = 0.0495 *
max_persistence: p = 0.0123 *
total_persistence: p = 0.2011
Understanding Effect Sizes (Cohen’s d)
A p-value tells you if groups differ statistically, but effect size tells you how much they differ in practical terms.
Cohen’s d measures the standardized difference between two group means:
\[d = \frac{\bar{x}_1 - \bar{x}_2}{s_{pooled}}\]
where \(s_{pooled}\) is the pooled standard deviation of both groups.
Interpretation guidelines:
| \|d\| value | Effect size | Interpretation |
|---|---|---|
| < 0.2 | Negligible | Groups nearly identical |
| 0.2 – 0.5 | Small | Detectable but minor difference |
| 0.5 – 0.8 | Medium | Noticeable practical difference |
| > 0.8 | Large | Substantial, meaningful difference |
Why effect size matters: With large samples, even tiny differences can be “statistically significant” (p < 0.05) but practically meaningless. Effect size helps distinguish meaningful differences from trivial ones.
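The formula translates directly to code. A Python sketch on hypothetical entropy values (the numbers below are illustrative, not taken from the dataset):

```python
import math

def cohens_d(x, y):
    """Cohen's d: difference of means over the pooled standard deviation
    (sample variances with Bessel's correction)."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    s_pooled = math.sqrt(((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2))
    return (mx - my) / s_pooled

# hypothetical entropy values: control webs vs a treated group
control = [4.8, 4.6, 4.5, 4.7, 4.4]
treated = [4.0, 4.1, 3.9, 4.2, 3.8]
print(round(cohens_d(control, treated), 2))  # 3.79, a very large effect
```

Note that with n = 5 per group, d estimates like this are themselves noisy; the very large values reported below should be read with that caveat in mind.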
Understanding Permutation Tests
A permutation test is a non-parametric method to compute p-values without assuming any particular distribution:
Calculate the observed difference between groups (e.g., difference in mean entropy)
Randomly shuffle group labels many times (e.g., 10,000 permutations)
Recalculate the difference after each shuffle
Count how often the shuffled difference exceeds the observed difference
p-value = (count + 1) / (n_permutations + 1)
Advantages:
No distributional assumptions — works for any data
Works with any test statistic
Provides exact p-values even for small samples
Intuitive interpretation: “how often would we see this difference by chance?”
Used here: We compare each drug group to CONTROL using permutation tests to get reliable p-values.
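The five steps above can be sketched in Python; the data below are hypothetical values chosen to be fully separated, not taken from the study:

```python
import random

def permutation_test(x, y, n_permutations=10000, seed=0):
    """Two-sided permutation test on the difference of group means,
    with the (count + 1) / (n + 1) p-value from the text."""
    rng = random.Random(seed)
    observed = abs(sum(x) / len(x) - sum(y) / len(y))
    pooled = list(x) + list(y)
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)                       # break any real group structure
        xs, ys = pooled[:len(x)], pooled[len(x):]
        if abs(sum(xs) / len(xs) - sum(ys) / len(ys)) >= observed:
            count += 1
    return (count + 1) / (n_permutations + 1)

# completely separated groups yield a tiny p-value
control = [4.8, 4.6, 4.5, 4.7, 4.4]
treated = [4.0, 4.1, 3.9, 4.2, 3.8]
print(permutation_test(control, treated) < 0.05)  # True
```

With 5 + 5 samples there are only C(10, 5) = 252 distinct splits, so the smallest achievable p-value is about 2/252 ≈ 0.008; this granularity is an inherent limit of permutation tests at this sample size.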
A Note on Multiple Comparisons
When we test multiple features across multiple drug groups, we increase the chance of false positives. With 4 drugs × 3 features = 12 tests at α = 0.05, we expect about 0.6 false positives by chance alone.
Recommendations for interpreting results:
Focus on results with p < 0.01 (more stringent threshold)
Prioritize findings with large effect sizes (|d| > 0.8)
Look for consistent patterns across related features (e.g., both entropy and n_cycles showing similar direction)
Results that meet multiple criteria (low p-value AND large effect size AND consistent pattern) are most reliable.
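As a concrete check, the Bonferroni threshold applied in Section 8.2 below can be reproduced from the twelve reported p-values:

```python
def bonferroni(p_values, alpha=0.05):
    """Bonferroni correction: each test is compared to alpha / m."""
    threshold = alpha / len(p_values)
    return threshold, [p < threshold for p in p_values]

# the 12 control-vs-drug p-values reported in Section 8.2
p_vals = [0.0077, 0.0304, 0.0029, 0.0158,   # entropy
          0.0294, 0.0616, 0.0174, 0.0462,   # n_cycles
          0.0613, 0.0382, 0.0028, 0.0202]   # max_persistence
thr, sig = bonferroni(p_vals)
print(round(thr, 4), sum(sig))  # 0.0042 2
```

Only the two GLIFOSATO comparisons (entropy and max_persistence) survive the corrected threshold, matching the notebook's count of 2.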
8.2 Pairwise Drug Comparisons with Effect Sizes
In [30]:
# Compare each drug to CONTROL with effect sizes
comparisons = vcat([
    pairwise_drug_comparison([s.entropy for s in stats_h1], species; feature_name="entropy"),
    pairwise_drug_comparison([s.n_features for s in stats_h1], species; feature_name="n_cycles"),
    pairwise_drug_comparison([s.max_persistence for s in stats_h1], species; feature_name="max_persistence"),
]...)

# Show results
select(comparisons, :drug, :feature,
    :diff_percent => (x -> round.(x, digits=1)) => :diff_pct,
    :cohens_d => (x -> round.(x, digits=2)) => :cohens_d,
    :effect_size, :p_value)
12×6 DataFrame

| Row | drug | feature | diff_pct | cohens_d | effect_size | p_value |
|----:|------|---------|---------:|---------:|-------------|--------:|
| 1 | CIPERMETRINA | entropy | -10.5 | -2.31 | large | 0.0077 |
| 2 | ENDOSULFAN | entropy | -9.5 | -1.76 | large | 0.0304 |
| 3 | GLIFOSATO | entropy | -15.3 | -2.77 | large | 0.0029 |
| 4 | SPINOSAD | entropy | -8.5 | -2.02 | large | 0.0158 |
| 5 | CIPERMETRINA | n_cycles | -33.4 | -1.63 | large | 0.0294 |
| 6 | ENDOSULFAN | n_cycles | -27.3 | -1.27 | large | 0.0616 |
| 7 | GLIFOSATO | n_cycles | -39.3 | -1.86 | large | 0.0174 |
| 8 | SPINOSAD | n_cycles | -27.7 | -1.39 | large | 0.0462 |
| 9 | CIPERMETRINA | max_persistence | 111.5 | 1.46 | large | 0.0613 |
| 10 | ENDOSULFAN | max_persistence | 146.7 | 1.64 | large | 0.0382 |
| 11 | GLIFOSATO | max_persistence | 265.4 | 3.16 | large | 0.0028 |
| 12 | SPINOSAD | max_persistence | 113.1 | 1.99 | large | 0.0202 |
In [31]:
# Bonferroni correction for multiple comparisons
n_tests = nrow(comparisons)
bonf_alpha = 0.05 / n_tests
println("Multiple comparison correction:")
println("  Number of tests: $n_tests")
println("  Bonferroni-corrected α = $(round(bonf_alpha, digits=4))")
println("  Comparisons significant after correction: ", sum(comparisons.p_value .< bonf_alpha))
Multiple comparison correction:
Number of tests: 12
Bonferroni-corrected α = 0.0042
Comparisons significant after correction: 2
9 Classification
Beyond hypothesis testing, we can ask: can we automatically identify which drug a spider was exposed to based on its web’s topological features? This is a classification task.
What is KNN (K-Nearest Neighbors)?
KNN is one of the simplest classification algorithms. To classify a new sample:
Compute the distance from the new sample to all training samples
Find the k nearest neighbors (k closest training samples)
Assign the majority class among those neighbors
Key parameter: k (number of neighbors)
Small k (e.g., k=1 or k=3): More sensitive to local patterns, but also to noise
Large k (e.g., k=10+): More robust, but may miss subtle differences
We use k=3 as a balanced choice that captures local structure without being overly sensitive to outliers.
Distance metric matters: We test both Wasserstein distance (comparing persistence diagrams directly) and Euclidean distance (comparing extracted features).
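The three steps translate to a few lines when distances are precomputed, as they are with a Wasserstein distance matrix. A Python sketch with a made-up distance row (the labels and distances are illustrative):

```python
from collections import Counter

def knn_predict(dist_row, labels, k=3):
    """Classify one sample from its precomputed distances to labeled
    samples: majority vote among the k nearest."""
    nearest = sorted(range(len(labels)), key=lambda j: dist_row[j])[:k]
    return Counter(labels[j] for j in nearest).most_common(1)[0][0]

# toy distance row: the sample sits closest to the two CONTROL webs
labels = ["CONTROL", "CONTROL", "GLIFOSATO", "GLIFOSATO"]
dists = [0.3, 0.5, 2.1, 1.8]
print(knn_predict(dists, labels, k=3))  # CONTROL (2 of 3 votes)
```

Because KNN only ever consumes pairwise distances, swapping Wasserstein for Euclidean (or any other metric) requires no change to the classifier itself.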
Why Cross-Validation? And What’s LOOCV vs K-Fold?
The problem: If we train and test on the same data, we get overly optimistic accuracy because the model has “seen” the answers. We need to estimate performance on unseen data.
9.0.1 Leave-One-Out Cross-Validation (LOOCV)
Remove one sample from the dataset
Train the model on the remaining n-1 samples
Predict the class of the held-out sample
Repeat for every sample
Accuracy = proportion of correct predictions
Pros: Uses maximum training data; deterministic (same result every time)
Cons: Computationally expensive; can have high variance
9.0.2 K-Fold Cross-Validation
Split data into k equal folds (e.g., k=5)
For each fold: train on k-1 folds, test on the remaining fold
Average accuracy across all folds
Pros: Good balance of bias and variance; faster than LOOCV
Cons: Results vary slightly depending on random split (we report mean ± std)
9.0.3 Interpreting Results
LOOCV accuracy: Single number, deterministic
K-fold accuracy: Reported as mean ± standard deviation
Higher accuracy = better classification; with 5 classes, random guessing achieves 20%, so accuracy well above 20% indicates genuine signal
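LOOCV combined with a distance-based KNN can be sketched compactly in Python; k-fold differs only in holding out blocks of samples instead of single ones. The toy data below are illustrative:

```python
from collections import Counter

def loocv_knn(D, labels, k=1):
    """Leave-one-out CV for distance-based KNN: hold out each sample,
    predict it from its k nearest neighbours among the rest."""
    n = len(labels)
    correct = 0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        nearest = sorted(others, key=lambda j: D[i][j])[:k]
        pred = Counter(labels[j] for j in nearest).most_common(1)[0][0]
        correct += pred == labels[i]
    return correct / n

# two tight, well-separated clusters: every held-out sample is recovered
pts = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
labels = ["A", "A", "A", "B", "B", "B"]
D = [[abs(p - q) for q in pts] for p in pts]
print(loocv_knn(D, labels, k=1))  # 1.0
```

The key detail is that sample i is excluded from its own neighbour list; forgetting that exclusion is the classic way to leak test information into training.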
Understanding Classification Metrics
Beyond overall accuracy, we report per-class metrics:
| Metric | What it measures | Formula |
|---|---|---|
| Precision | Of samples predicted as class X, how many are truly X? | TP / (TP + FP) |
| Recall | Of samples truly in class X, how many did we identify? | TP / (TP + FN) |
| F1 Score | Harmonic mean of precision and recall | 2 × (P × R) / (P + R) |
Interpreting the confusion matrix:
Diagonal elements: Correct predictions (true positives for each class)
Off-diagonal elements: Errors — reading row i, column j means “sample truly in class i was predicted as class j”
A perfect classifier has all counts on the diagonal
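These definitions can be checked on a small hypothetical confusion matrix (the counts below are invented for illustration, not the study's results):

```python
def per_class_metrics(cm, classes):
    """Precision, recall, F1 per class from a confusion matrix where
    cm[i][j] counts samples of true class i predicted as class j."""
    n = len(classes)
    out = {}
    for i, c in enumerate(classes):
        tp = cm[i][i]
        fp = sum(cm[r][i] for r in range(n)) - tp   # column total minus diagonal
        fn = sum(cm[i]) - tp                        # row total minus diagonal
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        out[c] = (prec, rec, f1)
    return out

# hypothetical 2-class matrix: rows are true classes, columns predictions
cm = [[4, 1],    # 4 CONTROL correct, 1 predicted as DRUG
      [2, 18]]   # 2 DRUG predicted as CONTROL, 18 correct
prec, rec, f1 = per_class_metrics(cm, ["CONTROL", "DRUG"])["CONTROL"]
print(round(prec, 3), round(rec, 3), round(f1, 3))  # 0.667 0.8 0.727
```

Precision reads down a column, recall reads across a row: the same off-diagonal cell hurts one class's recall and another class's precision.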
10 Method Comparison: TDA vs Traditional Approaches
To validate that TDA provides unique value beyond mathematical sophistication, we compare against traditional image analysis methods.
10.1 Why Compare Methods?
This comparison answers a critical question for expert reviewers: “Why use TDA when simpler methods might work?”
We test three alternative approaches:

1. PCA on raw pixels: Dimensionality reduction on flattened images
2. Handcrafted features: Domain-informed image statistics
3. TDA methods: Our topological approach (for reference)
What This Comparison Reveals
If TDA outperforms: Topological structure captures information that pixel-level methods miss
If alternatives perform similarly: TDA may be unnecessarily complex for this problem
If PCA dominates: Raw pixel patterns sufficient; topology adds little value
This provides empirical justification (or refutation) of the TDA methodology choice.
10.2 Alternative Feature Extraction
10.2.1 Method 1: PCA on Raw Pixels
In [43]:
using LinearAlgebra

# Flatten images to feature vectors (each image becomes a long vector)
println("Extracting raw pixel features...")
X_pixels_list = []
for img in As
    # Resize to consistent size and flatten
    img_resized = imresize(img, (100, 100))  # reduce to 100×100 for computational efficiency
    push!(X_pixels_list, vec(Float64.(img_resized)))
end
X_pixels = reduce(hcat, X_pixels_list)'
println("Raw pixel matrix size: $(size(X_pixels))")
println("  ($(size(X_pixels, 1)) samples × $(size(X_pixels, 2)) pixels)")

# Apply PCA for dimensionality reduction
println("\nApplying PCA...")
pca_model = fit(PCA, X_pixels', maxoutdim=20)
X_pixels_pca = predict(pca_model, X_pixels')'
variance_explained = principalratio(pca_model)
println("PCA with 20 components:")
println("  Variance explained: $(round(variance_explained * 100, digits=1))%")
println("  Reduced dimensions: $(size(X_pixels_pca, 2))")
Extracting raw pixel features...
Raw pixel matrix size: (25, 10000)
(25 samples × 10000 pixels)
Applying PCA...
PCA with 20 components:
Variance explained: 90.0%
Reduced dimensions: 20
10.2.2 Method 2: Handcrafted Image Features
In [44]:
using ImageFiltering

function handcrafted_features(img::Matrix)
    # Basic intensity statistics
    mean_intensity = mean(img)
    std_intensity = std(img)
    max_intensity = maximum(img)

    # Edge density (approximate with gradient magnitude)
    grad_y = diff(img, dims=1)
    grad_x = diff(img, dims=2)
    # Pad to match original size
    grad_y = vcat(grad_y, zeros(1, size(img, 2)))
    grad_x = hcat(grad_x, zeros(size(img, 1), 1))
    edge_strength = mean(sqrt.(grad_y .^ 2 .+ grad_x .^ 2))

    # Spatial distribution (center of mass)
    h, w = size(img)
    binary_img = img .> 0.1
    if sum(binary_img) > 0
        coords_y = [i for i in 1:h, j in 1:w if binary_img[i, j]]
        coords_x = [j for i in 1:h, j in 1:w if binary_img[i, j]]
        center_y = mean(coords_y)
        center_x = mean(coords_x)
        center_dist = sqrt((center_y - h / 2)^2 + (center_x - w / 2)^2)
        # Spread (standard deviation of positions)
        spread_y = std(coords_y)
        spread_x = std(coords_x)
    else
        center_dist = 0.0
        spread_y = 0.0
        spread_x = 0.0
    end

    # Density (proportion of above-threshold pixels)
    density = mean(img .> 0.1)

    [mean_intensity, std_intensity, max_intensity, edge_strength,
     center_dist, spread_y, spread_x, density]
end

println("Extracting handcrafted features...")
X_handcrafted_list = [handcrafted_features(img) for img in As]
X_handcrafted = reduce(hcat, X_handcrafted_list)'
feature_names_handcrafted = ["mean_intensity", "std_intensity", "max_intensity",
    "edge_strength", "center_dist", "spread_y", "spread_x", "density"]
println("Handcrafted feature matrix size: $(size(X_handcrafted))")
println("  Features: ", feature_names_handcrafted)

# Normalize handcrafted features
X_handcrafted_norm = mapslices(safe_zscore, X_handcrafted, dims=1)
X_handcrafted_norm[isnan.(X_handcrafted_norm)] .= 0.0
# Statistical comparison between best TDA and best alternative
best_tda = comparison_results[comparison_results.method .∈ [["TDA: Rich Stats (H1)", "TDA: Vectorized Diagram"]], :]
best_tda_row = first(sort(best_tda, :accuracy_mean, rev=true))
best_alternative = comparison_results[.!(comparison_results.method .∈ [["TDA: Rich Stats (H1)", "TDA: Vectorized Diagram"]]), :]
best_alt_row = first(sort(best_alternative, :accuracy_mean, rev=true))

println("\n=== Key Findings ===\n")
println("Best TDA method: $(best_tda_row.method)")
println("  Accuracy: $(round(best_tda_row.accuracy_mean * 100, digits=1))% ± $(round(best_tda_row.accuracy_std * 100, digits=1))%")
println()
println("Best alternative: $(best_alt_row.method)")
println("  Accuracy: $(round(best_alt_row.accuracy_mean * 100, digits=1))% ± $(round(best_alt_row.accuracy_std * 100, digits=1))%")
println()
advantage = best_tda_row.accuracy_mean - best_alt_row.accuracy_mean
println("TDA advantage: $(round(advantage * 100, digits=1)) percentage points")
if advantage > 0.05
    println("\n✓ TDA OUTPERFORMS traditional methods")
    println("  Topological features capture structure that pixel-level analysis misses")
elseif advantage < -0.05
    println("\n⚠ Traditional methods OUTPERFORM TDA")
    println("  Consider simpler approaches for this problem")
else
    println("\n≈ Methods perform SIMILARLY")
    println("  TDA provides interpretability without accuracy cost")
end
=== Key Findings ===
Best TDA method: TDA: Rich Stats (H1)
Accuracy: 40.0% ± 14.1%
Best alternative: Handcrafted Features
Accuracy: 48.0% ± 11.0%
TDA advantage: -8.0 percentage points
⚠ Traditional methods OUTPERFORM TDA
Consider simpler approaches for this problem
10.5 Strengths and Weaknesses of Each Approach
Method
Strengths
Weaknesses
Best Use Case
TDA Rich Stats
• Interpretable features • Topologically invariant • Few features (low overfitting)
• Loses spatial info • Requires TDA expertise
Small samples, need interpretability
TDA Vectorized
• Captures full diagram • Multi-scale information
• High-dimensional • Harder to interpret
Large samples, complex structure
PCA on Pixels
• Simple baseline • No domain knowledge needed
• Sensitive to rotation/translation • High-dimensional input
Quick baseline, large datasets
Handcrafted Features
• Fast computation • Domain-informed
• Requires expert feature engineering • May miss subtle patterns
When domain knowledge available
Why This Matters for Expert Review
Comparing methods demonstrates that:
TDA is not arbitrary: Empirical evidence shows whether topology adds value
Interpretability vs accuracy trade-off: TDA rich stats offer interpretable features with competitive accuracy
Small sample robustness: With N=25, lower-dimensional TDA features (12 dims) may generalize better than high-dimensional pixel features (10,000 dims → 20 PCA components)
This comparison strengthens the methodological contribution by showing TDA provides unique value rather than just mathematical sophistication.
This section provides rigorous statistical evidence for two key hypotheses:
CONTROL is clearly separable from all drug-treated groups
Drug classes are NOT easily separable from each other
13.1 Distance Combination
We combine Wasserstein distance (topological structure) with Euclidean distance (rich statistics features) to potentially improve classification.
In [51]:
# Optimize combination of Wasserstein and Euclidean distances
alpha_results = optimize_alpha(D_wasserstein, D_rich_stats, species; alpha_range=0.0:0.05:1.0, k=3)

println("=== Distance Combination Optimization ===")
println("Best alpha: $(alpha_results.best.alpha)")
println("Best accuracy: $(round(alpha_results.best.accuracy * 100, digits=1))%")
println("\nInterpretation:")
println("  alpha = 1.0 means pure Wasserstein distance")
println("  alpha = 0.0 means pure Euclidean (rich stats) distance")
=== Distance Combination Optimization ===
Best alpha: 0.5
Best accuracy: 44.0%
Interpretation:
alpha = 1.0 means pure Wasserstein distance
alpha = 0.0 means pure Euclidean (rich stats) distance
PERMANOVA tests whether group centroids differ significantly in multivariate space. It works directly on the Wasserstein distance matrix.
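The pseudo-F statistic and permutation p-value can be sketched self-containedly in Python, following Anderson's formulation in terms of total and within-group sums of squared distances; the notebook's `permanova_control_vs_drugs` is assumed to do something similar:

```python
import random

def permanova(D, labels, n_permutations=999, seed=0):
    """One-way PERMANOVA from a distance matrix: pseudo-F from sums of
    squared distances, p-value from permuting group labels."""
    n = len(labels)

    def pseudo_F(labs):
        groups = {}
        for i, g in enumerate(labs):
            groups.setdefault(g, []).append(i)
        k = len(groups)
        ss_total = sum(D[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
        ss_within = sum(
            sum(D[i][j] ** 2 for i in idx for j in idx if i < j) / len(idx)
            for idx in groups.values()
        )
        ss_between = ss_total - ss_within
        return (ss_between / (k - 1)) / (ss_within / (n - k))

    rng = random.Random(seed)
    observed = pseudo_F(labels)
    labs = list(labels)
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(labs)                 # permute labels, keep group sizes
        if pseudo_F(labs) >= observed:
            count += 1
    return observed, (count + 1) / (n_permutations + 1)

# toy data: two well-separated 1-D clusters of 5 points each
pts = [0.0, 0.1, 0.2, 0.3, 0.4, 5.0, 5.1, 5.2, 5.3, 5.4]
labels = ["A"] * 5 + ["B"] * 5
D = [[abs(p - q) for q in pts] for p in pts]
F, p = permanova(D, labels)
print(F > 1, p < 0.05)  # True True
```

Because only the distance matrix enters the computation, the same routine works unchanged on Wasserstein distances between persistence diagrams.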
13.4.1 Control vs Drugs
In [64]:
permanova_ctrl = permanova_control_vs_drugs(D_wasserstein, species; n_permutations=9999)

println("=== PERMANOVA: Control vs Drugs ===")
println("Pseudo-F: $(round(permanova_ctrl.f_statistic, digits=2))")
println("p-value: $(round(permanova_ctrl.p_value, digits=4))")
println()
if permanova_ctrl.p_value < 0.05
    println("✓ CONTROL centroid significantly differs from DRUG centroid (p < 0.05)")
else
    println("✗ No significant difference detected")
end
=== PERMANOVA: Control vs Drugs ===
Pseudo-F: 10.81
p-value: 0.0001
✓ CONTROL centroid significantly differs from DRUG centroid (p < 0.05)
13.4.2 Drug Equivalence Test
Testing whether drug groups differ from each other (excluding CONTROL).
In [65]:
drug_equiv = drug_equivalence_test(D_wasserstein, species; n_permutations=9999)

println("=== PERMANOVA: Among Drugs Only ===")
println("Pseudo-F: $(round(drug_equiv.f_statistic, digits=2))")
println("p-value: $(round(drug_equiv.p_value, digits=4))")
println()
println("Interpretation: $(drug_equiv.interpretation)")
=== PERMANOVA: Among Drugs Only ===
Pseudo-F: 3.25
p-value: 0.0001
Interpretation: Some drug differences detected
13.4.3 Pairwise Drug Comparisons
Testing each pair of drugs to see if they can be statistically distinguished.
In [66]:
# Use entropy as a key feature
drug_perm_tests = pairwise_drug_permutation_tests([s.entropy for s in stats_h1], species; n_permutations=10000)

println("=== Pairwise Drug Permutation Tests (Entropy) ===")
drug_perm_tests
=== Pairwise Drug Permutation Tests (Entropy) ===
6×6 DataFrame
 Row │ drug1         drug2       mean_diff  p_value     significant  interpretation
     │ String        String      Float64    Float64     Bool         String
─────┼─────────────────────────────────────────────────────────────────────────────────
   1 │ CIPERMETRINA  ENDOSULFAN  0.0450259  0.70443          false   NOT distinguishable
   2 │ CIPERMETRINA  GLIFOSATO   0.214446   0.060494         false   NOT distinguishable
   3 │ CIPERMETRINA  SPINOSAD    0.089826   0.19618          false   NOT distinguishable
   4 │ ENDOSULFAN    GLIFOSATO   0.259471   0.105689         false   NOT distinguishable
   5 │ ENDOSULFAN    SPINOSAD    0.0448001  0.716228         false   NOT distinguishable
   6 │ GLIFOSATO     SPINOSAD    0.304272   0.00939906       true    distinguishable
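Each row of the table above comes from a two-sample permutation test on the difference in mean entropy. A self-contained sketch of that logic (`perm_test` is a hypothetical reimplementation, not the notebook's `pairwise_drug_permutation_tests`):

```julia
using Random, Statistics

# Two-sample permutation test: is the observed |mean difference| larger than
# expected when group labels are assigned at random?
function perm_test(x::Vector{Float64}, y::Vector{Float64};
                   n_perm=10_000, rng=MersenneTwister(42))
    obs = abs(mean(x) - mean(y))
    pooled = vcat(x, y)
    n = length(x)
    hits = 0
    for _ in 1:n_perm
        s = shuffle(rng, pooled)
        hits += abs(mean(s[1:n]) - mean(s[n+1:end])) >= obs
    end
    return (hits + 1) / (n_perm + 1)
end
```

Note that with N = 5 per group there are only C(10,5) = 252 distinct label splits, which caps how small the p-value can get — one reason so few pairs reach significance above.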
13.5 Confusion Analysis
Which classes are most often confused with each other?
In [67]:
confusion_df = pairwise_confusion_analysis(species, result_knn_wass.predictions)

println("=== Top Confusion Pairs ===")
first(confusion_df, 10)
=== Top Confusion Pairs ===
10×4 DataFrame
 Row │ true_class    predicted_class  confusion_rate  count
     │ String        String           Float64         Int64
─────┼──────────────────────────────────────────────────────
   1 │ CIPERMETRINA  SPINOSAD                   60.0      3
   2 │ ENDOSULFAN    CIPERMETRINA               40.0      2
   3 │ ENDOSULFAN    CONTROL                    40.0      2
   4 │ GLIFOSATO     SPINOSAD                   40.0      2
   5 │ CIPERMETRINA  ENDOSULFAN                 20.0      1
   6 │ CIPERMETRINA  GLIFOSATO                  20.0      1
   7 │ CONTROL       ENDOSULFAN                 20.0      1
   8 │ GLIFOSATO     CIPERMETRINA               20.0      1
   9 │ SPINOSAD      CIPERMETRINA               20.0      1
  10 │ CIPERMETRINA  CONTROL                     0.0      0
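The confusion rates above are simple conditional frequencies. A sketch of how they could be computed from true labels and cross-validated predictions (`confusion_rates` is a hypothetical stand-in for `pairwise_confusion_analysis`):

```julia
# Percentage of samples of each true class that were predicted as each other class
function confusion_rates(truth::Vector{String}, preds::Vector{String})
    rates = Dict{Tuple{String,String},Float64}()
    for t in unique(truth)
        idx = findall(==(t), truth)
        for p in unique(preds)
            t == p && continue
            rates[(t, p)] = 100 * count(==(p), preds[idx]) / length(idx)
        end
    end
    return rates
end

# Example: half of the "A" samples are predicted "B"
truth = ["A", "A", "B", "B"]
preds = ["A", "B", "B", "B"]
confusion_rates(truth, preds)[("A", "B")]   # == 50.0
```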
13.6 Summary: Separability Evidence
In [68]:
println("="^60)
println("SEPARABILITY ANALYSIS SUMMARY")
println("="^60)

println("\n### Evidence that CONTROL is SEPARABLE ###")
println("Binary classification accuracy: $(round(binary_result.accuracy * 100, digits=1))%")
println("ROC AUC: $(round(roc_result.auc, digits=3))")
println("PERMANOVA (Ctrl vs Drugs) p-value: $(round(permanova_ctrl.p_value, digits=4))")
println("Control silhouette score: $(round(sil_result.by_class["CONTROL"].mean, digits=3))")

ctrl_separable = binary_result.accuracy > 0.85 && permanova_ctrl.p_value < 0.05
println("\nConclusion: ", ctrl_separable ? "✓ CONTROL IS CLEARLY SEPARABLE" : "⚠ Evidence is weak")

println("\n### Evidence that DRUGS are NOT separable ###")
println("Drug-only PERMANOVA p-value: $(round(drug_equiv.p_value, digits=4))")
println("Drugs-only within/between ratio: $(round(wb_drugs.ratio, digits=3))")

# Get mean silhouette for drugs
drug_sils = [sil_result.by_class[k].mean for k in keys(sil_result.by_class) if k != "CONTROL"]
mean_drug_sil = mean(drug_sils)
println("Mean drug silhouette: $(round(mean_drug_sil, digits=3))")

drugs_not_separable = drug_equiv.p_value >= 0.05 || wb_drugs.ratio > 0.7
println("\nConclusion: ", drugs_not_separable ? "✓ DRUGS ARE NOT EASILY SEPARABLE" : "⚠ Some drug differences detected")

println("\n" * "="^60)
============================================================
SEPARABILITY ANALYSIS SUMMARY
============================================================
### Evidence that CONTROL is SEPARABLE ###
Binary classification accuracy: 88.0%
ROC AUC: 0.955
PERMANOVA (Ctrl vs Drugs) p-value: 0.0001
Control silhouette score: -0.013
Conclusion: ✓ CONTROL IS CLEARLY SEPARABLE
### Evidence that DRUGS are NOT separable ###
Drug-only PERMANOVA p-value: 0.0001
Drugs-only within/between ratio: 0.839
Mean drug silhouette: -0.017
Conclusion: ✓ DRUGS ARE NOT EASILY SEPARABLE
============================================================
14 Limitations and Future Directions
14.1 Methodological Limitations
14.1.1 Sample Size
N=5 per group is insufficient for robust statistical inference
Effect size estimates have very wide confidence intervals
High risk of Type II error (missing true effects)
Classification accuracy likely overestimated due to overfitting
Permutation tests have limited precision with small sample sizes
Impact: Results should be viewed as exploratory and hypothesis-generating, not confirmatory.
14.1.2 Parameter Selection
All preprocessing and TDA parameters were chosen heuristically without systematic optimization:
Parameter    | Value Used   | Justification
blur         | 2            | Heuristic choice (not optimized)
threshold    | 0.1          | Visual inspection (not data-driven)
sample_size  | 1000 points  | Computational convenience (not validated)
rips_cutoff  | 5            | Arbitrary choice (no sensitivity analysis shown)
k (KNN)      | 3            | Standard default (not tuned)
Wasserstein  | (p=1, q=2)   | Not compared to alternatives
Impact: Results may be sensitive to these choices. A systematic sensitivity analysis would strengthen conclusions (recommended for future work).
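Such a sensitivity analysis could be organized as a simple parameter grid; `run_pipeline` below is a placeholder for re-running the full filtration + classification pipeline at each setting (purely illustrative — none of these names exist in the notebook):

```julia
# Placeholder: in practice this would redo preprocessing, persistence
# computation, and KNN cross-validation, returning an accuracy for the
# given parameter setting.
run_pipeline(; blur, threshold, k) = 0.0

# 3 × 3 × 2 grid over the heuristically chosen parameters from the table above
results = [(blur=b, threshold=t, k=k, acc=run_pipeline(blur=b, threshold=t, k=k))
           for b in (1, 2, 3), t in (0.05, 0.1, 0.2), k in (3, 5)]
length(results)   # 18 settings
```

Reporting accuracy across the grid (rather than at a single setting) would show directly how fragile the conclusions are to each choice.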
14.1.3 Validation Strategy
No independent validation dataset
All reported accuracies use cross-validation on the same 25 samples
High risk of overfitting to sample-specific patterns
True generalization performance likely lower than reported
Statistical topology methods: Use recent advances for inference on persistence diagrams
Temporal dynamics: If possible, track same spiders building multiple webs
14.6.3 Biological Integration
Dose-response curves: Test multiple drug concentrations
Behavioral correlates: Link topological features to specific motor/cognitive deficits
Multi-species comparison: Test generalizability across spider families
Mechanism investigation: Use neurobiology techniques to validate TDA findings
14.7 Conclusions with Appropriate Caveats
This exploratory analysis demonstrates that:
TDA captures drug effects: CONTROL webs are topologically distinguishable from drug-treated webs
H1 features are informative: Entropy, cycle count, and persistence metrics vary systematically
Drug classes overlap: Different drugs produce similar topological signatures, suggesting common disruption pathways
Method shows promise: TDA provides interpretable, geometrically motivated features
However, with N=5 per group and no independent validation, these findings require:
Replication on larger, independent datasets
Systematic parameter validation
Controlled experimental conditions
Biological mechanism investigation
This work is best viewed as methodological proof-of-concept rather than definitive toxicological findings.